Improving upon the efficiency of complete case analysis when covariates are MNAR

Authors

  • Jonathan W. Bartlett
  • James R. Carpenter
  • Kate Tilling
  • Stijn Vansteelandt
Abstract

Missing values in covariates of regression models are a pervasive problem in empirical research. Popular approaches for analyzing partially observed datasets include complete case analysis (CCA), multiple imputation (MI), and inverse probability weighting (IPW). In the case of missing covariate values, these methods (as typically implemented) are valid under different missingness assumptions. In particular, CCA is valid under missing not at random (MNAR) mechanisms in which missingness in a covariate depends on the value of that covariate, but is conditionally independent of outcome. In this paper, we argue that in some settings such an assumption is more plausible than the missing at random assumption underpinning most implementations of MI and IPW. When the former assumption holds, although CCA gives consistent estimates, it does not make use of all observed information. We therefore propose an augmented CCA approach which makes the same conditional independence assumption for missingness as CCA, but which improves efficiency through specification of an additional model for the probability of missingness, given the fully observed variables. The new method is evaluated using simulations and illustrated through application to data on reported alcohol consumption and blood pressure from the US National Health and Nutrition Examination Survey, in which data are likely MNAR independent of outcome.
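The validity condition for CCA described above can be illustrated with a small simulation. The following is a minimal sketch, not the authors' simulation study and not the augmented CCA estimator: the linear outcome model, the logistic missingness model, the coefficient values, and the function name `cca_slope` are all assumptions chosen for illustration. It fits ordinary least squares to the complete cases only, once when missingness in the covariate depends on the covariate itself and once when it depends on the outcome.

```python
import numpy as np


def cca_slope(depends_on, n=200_000, beta0=1.0, beta1=2.0, seed=0):
    """Complete-case OLS slope of Y on X when X is partially missing.

    depends_on="x": missingness in X depends only on X itself
                    (MNAR, but independent of Y given X).
    depends_on="y": missingness in X depends on the outcome Y.
    All numeric settings are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = beta0 + beta1 * x + rng.normal(size=n)

    # Hypothetical logistic model for the probability that X is observed.
    driver = x if depends_on == "x" else y
    p_observed = 1.0 / (1.0 + np.exp(-(0.5 + 1.5 * driver)))
    observed = rng.random(n) < p_observed

    # Complete case analysis: keep only rows where X is observed.
    design = np.column_stack([np.ones(observed.sum()), x[observed]])
    coef, *_ = np.linalg.lstsq(design, y[observed], rcond=None)
    return coef[1]


if __name__ == "__main__":
    print("slope, missingness depends on X:", cca_slope("x"))
    print("slope, missingness depends on Y:", cca_slope("y"))
```

Under these assumed settings, the first slope should land close to the true value of 2, while the second should be noticeably attenuated, because selecting on the outcome breaks the conditional independence that CCA relies on.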

Similar resources

Corrigendum: Improving upon the efficiency of complete case analysis when covariates are MNAR (10.1093/biostatistics/kxu023).

2. APPLICATION TO NHANES: CORRECTION OF VARIABLE DEFINITION The main manuscript stated that log_e(average no. drinks per day + 1) was used as covariate in the model of interest in the analysis of data from NHANES. This was incorrect: log_e(average no. drinks per day) was used as covariate. For the ACC and MI estimators, the number of drinks minus one was imputed using negative binomial regression...

Comparison of techniques for handling missing covariate data within prognostic modelling studies: a simulation study

BACKGROUND: There is no consensus on the most appropriate approach to handle missing covariate data within prognostic modelling studies. Therefore a simulation study was performed to assess the effects of different missing data techniques on the performance of a prognostic model. METHODS: Datasets were generated to resemble the skewed distributions seen in a motivating breast cancer example. Mu...

Prediction of mental disorders after Mild Traumatic Brain Injury: principle component Approach

Introduction: In process modeling, a relatively high correlation between covariates creates multicollinearity, which reduces the model's efficiency. In this study, principal component analysis was used to examine how the effect of multicollinearity can be modified in Artificial Neural Network (ANN) and Logistic Regression (LR) models. Also, the effect of multicolineari...

Diagnosing Global Case Influence on MAR Versus MNAR Model Comparisons

When missingness is suspected to be not at random (MNAR) in longitudinal studies, researchers sometimes compare the fit of a target model that assumes missingness at random (here termed a MAR model) and a model that accommodates a hypothesized MNAR missingness mechanism (here termed a MNAR model). It is well known that such comparisons are only interpretable conditional on the validity of the c...

Comparison of Bayesian and classical methods for estimating the parameters of the logistic regression model in the presence of missing values in covariates

Background and Aim: Logistic regression is an analytic tool widely used in medical and epidemiologic research. In many studies, we face data sets in which some of the data are not recorded. A simple way to deal with such "missing data" is to simply ignore the subjects with missing observations, and perform the analysis on cases for which complete data are available. Materials and Methods: We c...

Journal title:

Volume 15  Issue

Pages  -

Publication date: 2014